Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When C14N fails try comparison without canonicalization #88

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

eclipxe13
Copy link

The issue #87 points out that in some edge cases the C14N can fail.
It is not a common case, even psalm don't recognize that c14n() can return false.

This problem can be reproduced by setting a namespace uri value as a defined xml entity.

<!DOCTYPE foo SYSTEM "foo.dtd" [
  <!ENTITY ns_foo "http://uri.tld/foo">
]>
<i:foo xmlns:i="&ns_foo;"/>

I append two cases to tests when C14N fails (using current providers):

  • Identicals documents must assert equal
  • Different documents must throw ComparisonFailure

To solve it, I separate @$document->loadXML($node->C14N()) into two steps: $c14n = @$node->C14N() and @$document->loadXML($c14n).
If $c14n === false then retry node to text conversion but without using canonicalization.

Observations:

  • This is an edge case, C14N failing is very uncommon.
  • DOMNodeComparator::nodeToText is always called with $canonicalize = true, DOMNodeComparator::assertEquals ignored the $canonicalize argument.
  • The comparison on this cases (when C14N fails) is more strict, since canonicalization is not performed.
  • Didn't remove the silent operator, and extended it to c14n() call. This is preserving the same behavior as before.

If you think is necessary I can try other approaches:

  • Node canonicalization and xml loading capturing libxml errors, then throw an exception.
    This will require to add another exception that contains the libxml_get_errors() array, or,
    implode the errors into a single message and use the current SebastianBergmann\Comparator\RuntimException.

  • If convert the node to text fails then throw a ComparisonFailure. This is like saying that $expected is different than $actual because the comparator is unable to perform the comparison.

  • Change DOMNodeComparator::acept() and only return true if objects are DOMNode and the canonicalization don't fail. I don't like this approach because canonicalization can be very expensive.

@codecov
Copy link

codecov bot commented Jul 17, 2020

Codecov Report

Merging #88 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master      #88   +/-   ##
=========================================
  Coverage     99.30%   99.31%           
- Complexity      125      127    +2     
=========================================
  Files            15       15           
  Lines           288      291    +3     
=========================================
+ Hits            286      289    +3     
  Misses            2        2           
Impacted Files Coverage Δ Complexity Δ
src/DOMNodeComparator.php 100.00% <100.00%> (ø) 12.00 <0.00> (+2.00)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0e7c210...c0f0b58. Read the comment docs.

@eclipxe13
Copy link
Author

@sebastianbergmann would you please review this PR or recommend another approach?

@sebastianbergmann
Copy link
Owner

I am sorry but I did not have time yet to look at this. Maybe @theseer can have a look?

@eclipxe13
Copy link
Author

@theseer would you have the chance to take a look on this PR?
I would like to address this issue, I can try another approach if you think is better.
Thanks in advance!

src/DOMNodeComparator.php Outdated Show resolved Hide resolved
src/DOMNodeComparator.php Outdated Show resolved Hide resolved
tests/DOMNodeComparatorTest.php Outdated Show resolved Hide resolved
@theseer
Copy link

theseer commented Aug 4, 2020

The more I look at the implementation and how it's growing I'm wondering if it needs to be that complex.

Another edge case for instance we're not dealing with currently is in line 86: We now catch the edge case of an empty string returned from c14n for $node not being within a document but here we simply assume that $node->ownerDocument will not be null. But it can if we do not call c14n.

Again, do we really need all this complexity?

@eclipxe13
Copy link
Author

Again, do we really need all this complexity?

I don't know. I think that all this complexity is caused by:
a) accept any DOMNode or DOMDocument, no matter when they are orphans.
b) transform a DOMNode or DOMDocument to string using a single private method.

Is there any way for DOMNodeComparator abort because samples are too complex to transform in string format?
I think that an exception is better than a false positive.

There are other issues in the class, like the one you just point out, also PR #71, to test nodeToText() you must create a comparison between two elements, secondary effects on the nodes under testing, assertEquals does not honor $canonicalize parameter.

@eclipxe13
Copy link
Author

@theseer are you interested on this PR? what improvements can I made?
I think a "no" is just fine. Just to know if I have more to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants